Speech Enabled Avatar from a Single Photograph
This paper presents a complete framework for creating speech-enabled 2D and 3D avatars from a single image of a person. Our approach uses a generic facial motion model that represents deformations of the prototype face during speech. We have developed an HMM-based facial animation algorithm that takes into account both lexical stress and coarticulation. This algorithm produces realistic animations of the prototype facial surface from either text or speech. The generic facial motion model is transformed to a novel face geometry using a set of corresponding points between the generic mesh and the novel face. In the case of a 2D avatar, a single photograph of the person is used as input. We manually select a small number of features on the photograph, and these are used to deform the prototype surface. The deformed surface is then used to animate the photograph. In the case of a 3D avatar, we use a single stereo image of the person as input. The sparse geometry of the face is computed from this image and used to warp the prototype surface to obtain the complete 3D surface of the person's face. This surface is etched into a glass cube using sub-surface laser engraving (SSLE) technology. Synthesized facial animation videos are then projected onto the etched glass cube. Even though the etched surface is static, the projection of facial animation onto it results in a compelling experience for the viewer. We show several examples of 2D and 3D avatars that are driven by text and speech inputs.
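As an illustration of the mesh-deformation step, the sketch below warps a prototype mesh onto a novel face from a handful of corresponding points. The abstract does not name the interpolation scheme, so a thin-plate-spline interpolator (scipy's RBFInterpolator) is assumed here as a plausible stand-in, and all function and variable names are hypothetical.

```python
# Hedged sketch: deform a prototype face mesh so that a few manually selected
# landmarks map onto their locations on a novel face, smoothly interpolating
# the displacement everywhere else. Thin-plate-spline interpolation is an
# assumption; the paper does not specify its deformation function.
import numpy as np
from scipy.interpolate import RBFInterpolator

def warp_prototype(proto_vertices, proto_landmarks, novel_landmarks):
    """proto_vertices : (V, 3) full prototype mesh
    proto_landmarks   : (K, 3) selected feature points on the prototype
    novel_landmarks   : (K, 3) the same K features located on the novel face
    Returns the (V, 3) warped mesh."""
    displacements = novel_landmarks - proto_landmarks            # (K, 3)
    interp = RBFInterpolator(proto_landmarks, displacements,
                             kernel='thin_plate_spline')
    return proto_vertices + interp(proto_vertices)               # (V, 3)

# Toy usage: a 100-vertex mesh and 5 corresponding landmark pairs.
rng = np.random.default_rng(0)
mesh = rng.normal(size=(100, 3))
src = rng.normal(size=(5, 3))
dst = src + 0.1 * rng.normal(size=(5, 3))
warped = warp_prototype(mesh, src, dst)
```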
Class-Level Spectral Features for Emotion Recognition
The most common approaches to automatic emotion recognition rely on utterance-level prosodic features. Recent studies have shown that utterance-level statistics of segmental spectral features also contain rich information about expressivity and emotion. In our work we introduce a more fine-grained yet robust set of spectral features: statistics of Mel-Frequency Cepstral Coefficients computed over three phoneme-type classes of interest – stressed vowels, unstressed vowels, and consonants in the utterance. We investigate the performance of our features in the task of speaker-independent emotion recognition using two publicly available datasets. Our experimental results clearly indicate that both the richer set of spectral features and the differentiation between phoneme-type classes are beneficial for the task. Classification accuracies are consistently higher for our features than for prosodic or utterance-level spectral features. Combining our phoneme-class features with prosodic features leads to further improvement. Given the large number of class-level spectral features, we expected feature selection to improve results even further, but none of several selection methods led to clear gains. Further analyses reveal that spectral features computed from consonant regions of the utterance contain more information about emotion than either stressed- or unstressed-vowel features. We also explore how emotion recognition accuracy depends on utterance length. We show that, while there is no significant dependence for utterance-level prosodic features, the accuracy of emotion recognition using class-level spectral features increases with utterance length.
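A minimal sketch of the class-level feature extraction described above: MFCC statistics pooled separately over stressed-vowel, unstressed-vowel, and consonant regions. The choice of statistics (mean and standard deviation) and the MFCC settings are assumptions, since the abstract names only "statistics of Mel-Frequency Cepstral Coefficients"; the phoneme alignment is taken as a given input, e.g. from a forced aligner, and the function name is hypothetical.

```python
# Hedged sketch: pool MFCC frames by phoneme-type class and compute per-class
# statistics, yielding one fixed-length feature vector per utterance.
import numpy as np
import librosa

CLASSES = ("stressed_vowel", "unstressed_vowel", "consonant")

def class_level_mfcc_features(y, sr, segments, n_mfcc=13, hop_length=160):
    """segments: list of (start_sec, end_sec, phoneme_class) tuples covering
    the utterance, with phoneme_class one of CLASSES (from a forced aligner)."""
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=n_mfcc,
                                hop_length=hop_length)           # (n_mfcc, T)
    frames_per_sec = sr / hop_length
    pooled = {c: [] for c in CLASSES}
    for start, end, cls in segments:
        lo, hi = int(start * frames_per_sec), int(end * frames_per_sec)
        if hi > lo:
            pooled[cls].append(mfcc[:, lo:hi])
    feats = []
    for c in CLASSES:
        if pooled[c]:
            block = np.concatenate(pooled[c], axis=1)   # all frames of class c
            feats.extend([block.mean(axis=1), block.std(axis=1)])
        else:
            feats.extend([np.zeros(n_mfcc), np.zeros(n_mfcc)])  # class absent
    # 3 classes x 2 statistics x n_mfcc coefficients
    return np.concatenate(feats)

# Usage (paths and alignment are placeholders):
# y, sr = librosa.load("utterance.wav", sr=16000)
# vec = class_level_mfcc_features(y, sr,
#         segments=[(0.00, 0.12, "consonant"), (0.12, 0.31, "stressed_vowel")])
```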